Abstract

The semi-automatic or automatic synthesis of robot controller soft- 
ware is both desirable and challenging. Synthesis of rather simple behav- 
iors such as collision avoidance by applying artificial evolution has been 
shown multiple times. However, the difficulty of this synthesis increases 
heavily with increasing complexity of the task that should be performed 
by the robot. We try to tackle this problem of complexity with Artifi- 
cial Homeostatic Hormone Systems (AHHS), which provide both intrin- 
sic, homeostatic processes and (transient) intrinsic, variant behavior. By 
using AHHS the need for pre-defined controller topologies or information 
about the field of application is minimized. We investigate how the basic
design of the controller and the hormone network size affect the
overall performance of the artificial evolution (i.e., evolvability). This is
done by comparing two variants of AHHS that show different effects when 
mutated. We evolve a controller for a robot built from five autonomous, 
cooperating modules. The desired behavior is a form of gait resulting in 
fast locomotion by using the modules' main hinges. 



1 Introduction 



The (semi-)automatic synthesis of robot controllers with artificial evolution
belongs to the software section of evolutionary robotics (Cliff et al., 1993).
The main challenge in this field is the curse of complexity because an increase
in the difficulty of the desired behavior results in a significantly super-linear
increase in the complexity of its evolution. This is partially documented by the
absence of complex tasks in the literature (Nelson et al., 2009). Additionally,
in evolutionary robotics the cost of the fitness evaluation is rather high even in 
case of simulations, if the application of a physics engine (simulation of friction, 
inertia etc.) cannot be avoided. Another challenge is the appropriate choice
of a genetic encoding (Mataric and Cliff, 1996) and the basic principle of the
controller design as they define the designable fraction of the search space and
the fitness landscape (non-designable fractions are induced, for example, by the
environment or the task itself). While the search space should be kept small, 
the fitness landscape should be smooth with a minimum number of local optima. 
Experience shows that these two criteria conflict. We summarize this
complex of challenges by the aim to 'strive for high evolvability'. 
Concerning the problem of finding appropriate controller designs a pleasant 
trend can be observed in recent literature. The most prominent candidate is
presumably the HyperNEAT design (Stanley et al., 2009; Clune et al., 2009). It is
based on artificial neural networks (ANN) but combines the 'search for
appropriate network weights with complexification of the network structure'
(Stanley and Miikkulainen, 2004) through the generation of connectivity
patterns. It has proven to have good evolvability combined with an adequate
range of applications. Other promising, recent approaches tend to be more
inspired by biology, in particular by unicellular organisms and endocrine
systems. Examples showing good evolvability are the reaction-diffusion
controller by Dale and Husbands (2010) and homeostasis and hormone systems
based on GasNets (Vargas et al., 2009) and ANNs (Neal and Timmis, 2003). They
indicate homeostasis as a prominent feature in successful adaptation to dynamic
environments.
In this paper, we analyze a controller design called Artificial Homeostatic
Hormone Systems (AHHS) that is based on hormones only and was introduced
before (Hamann et al., 2010; Schmickl et al., 2010; Schmickl and Crailsheim,
2009; Stradner et al., 2010, 2009). AHHS is a reaction-diffusion approach.
Sensory stimuli are converted into hormone secretions that, in turn, control the
actuators. In addition, hormones interact linearly and non-linearly comparable 
to the hidden layer of ANN. The topology of this hormone-reaction network 
is not predefined. Such systems show homeostatic processes because they typi- 
cally converge to trivial equilibria for constant sensor input. The sensory stimuli 
are basically integrated in form of hormone concentrations (a form of memory) 
and decomposed over time (oblivion). However, during a limited period of 
time (transient) after a stimulus they show also variant behavior, especially, 
if non-linear hormone-to-hormone interactions are applied. This way, explo- 
rative behavior of the robot is implemented that allows for the testing of many 
sensory-motor configurations. The concept of AHHS is related to gene regula- 
tory networks. However, here each edge has its own activation threshold and 
redundant edges with different activations between two hormones are allowed.
The desired main application of AHHS is multi-modular robotics (SYMBRION,
2010; REPLICATOR, 2010). In this field, autonomous robotic modules are
studied that are able to physically connect to each other and can also
establish a communication and energy connection. Hence, they form a super-robot
called 'organism' that is able to reconfigure its body shape; see, for example,
Shen et al. (2006) or Murata et al. (2008). Therefore, the underlying idea of
diffusion in our reaction-diffusion system is that hormones diffuse from robot
module to robot module and establish a low-level communication. Following our 
maxim of trying to reach a maximum of plasticity we use identical controllers in 
each module independent of their position within the robot organism, so there 
is neither a controller nor a module specialization. This concept implements 
the focus of evolutionary robotics on modularity (among others) in terms of
hardware and software (Nolfi and Floreano, 2004). Although we evolve
cooperative behaviors by evolving a kind of self-organized role selection, there is no
co-evolution. 

In general, our approach is more organic in contrast to the typical symbolic 
approach (direct encoding of pitch, roll, yaw angles, use of pattern generators 
using Gaussian functions etc.). The biological inspiration is not practiced as 
an end in itself but rather introduces more robustness in computations and 
it allows the diffusion of such values from module to module (implementing 
implicit communication).

One focus of our current research track is to design fitness landscapes by using 
appropriate controller designs. We investigate possibilities of smoothing the fit- 
ness landscape by a sophisticated interaction between the controller design and 
the mutation operator. We test whether it is useful to maximize the causality 
of the mutation operator (i.e., small causes have small effects) by reducing the 
maximal impact on the organism's behavior. However, whether high causality
is really desirable is questionable (cf. Chouard, 2010).
The investigated scenario is a modular-robotics variant of gait learning in sim- 
ulation. Initially, we connect five modules in a simple chain formation as the 
body formation itself is not yet in our focus. The task is to move as far as 
possible by utilizing the hinge in each module only (no wheels). 



2 Artificial Homeostatic Hormone Systems 

In AHHS, sensors trigger hormone secretions, which increase hormone concen- 
trations in the robot. These hormones diffuse, integrate, decay, interact and 
finally, affect actuators. We have analyzed AHHS controllers in single robots
before (Schmickl et al., 2010; Schmickl and Crailsheim, 2009; Stradner et al.,
2010, 2009). In these cases, the robot's body was virtually divided into
compartments that hold hormones and between which hormones diffuse. These
compartments create a spatial context (embodiment) by associating sensors and 
actuators with explicit compartments (e.g., left proximity sensor and left wheel 
actuator are associated with the left compartment and hence depend only on 
hormone concentrations of this compartment). In the case of modular robotics, 
the subdivision of the robot organism is naturally defined by the modules
themselves. A virtual compartmentalization is not necessary and hormones diffuse
from module to module (see Fig. 1). A first small case study with organisms
built from three modules was reported in (Hamann et al., 2010).
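
The module-to-module diffusion described above can be illustrated with a minimal Python sketch. The exact update rule is not given in this section, so the sketch assumes a simple discrete exchange between neighboring modules of a chain with an optional decay; the names `diffuse_chain`, `d`, and `decay` are hypothetical and only serve the illustration.

```python
def diffuse_chain(conc, d, decay=0.0):
    """One synchronous update of a single hormone along a chain of modules.

    conc  -- list of concentrations, one entry per module (front to back)
    d     -- diffusion coefficient of this hormone (assumed 0 <= d <= 0.5)
    decay -- fraction of the hormone removed per step (assumption)
    """
    n = len(conc)
    updated = []
    for i, c in enumerate(conc):
        flux = 0.0
        if i > 0:                      # exchange with the module in front
            flux += d * (conc[i - 1] - c)
        if i < n - 1:                  # exchange with the module behind
            flux += d * (conc[i + 1] - c)
        updated.append((c + flux) * (1.0 - decay))
    return updated


if __name__ == "__main__":
    # Hormone secreted only in the first of five modules spreads backwards.
    state = [1.0, 0.0, 0.0, 0.0, 0.0]
    for _ in range(3):
        state = diffuse_chain(state, d=0.25)
        print(state)
```

Without decay the total hormone amount in the organism is conserved in this sketch, so any change in the sum stems from secretion or decay only.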
Figure 1: Sketch of the hormone dynamics and diffusion processes in an
organism. Each module holds different hormones with different concentrations,
hormones diffuse through the organism based on a diffusion coefficient evolved 
individually for each hormone, module locations (e.g., elevation) are not rele- 
vant for diffusion; sensor settings simplified, actually four proximity sensors per 
module. 

2.1 AHHS1 

We call the AHHS initially presented in (Schmickl et al., 2010; Schmickl and
Crailsheim, 2009) AHHS1. An AHHS consists of a set of hormones and a set of rules. On
the one hand, it defines production/decay rates and diffusion coefficients for each 
hormone. On the other hand, it defines by rules the production through sen- 
sors and interaction of hormones as well as their influence on actuators. There 
are four types of rules. Sensor rules define the production of hormone through 
sensor input. Actuator rules define the control of actuators through hormone 
concentrations. Hormone rules define the interaction between hormones, that 
is, one hormone triggers the production of another hormone (or itself). Ad- 
ditionally, there is an idle rule to allow a direct deactivation of rules through 
mutations. Rules are triggered at runtime, if a certain threshold is reached (sen- 
sor values in case of sensor rules or hormone concentrations in case of hormone 
rules). The amount of produced hormone or the actuator control value depends
linearly on the controlling sensor or hormone, respectively ($\lambda x + \kappa$).
For more details see Schmickl et al. (2010).
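
As a rough illustration of such a threshold-triggered, linear rule, consider the following Python sketch. It assumes that a rule fires once its input reaches the threshold; the class `LinearRule` and its field names are hypothetical and do not reproduce the original implementation.

```python
from dataclasses import dataclass


@dataclass
class LinearRule:
    """Threshold-triggered linear rule (illustrative, AHHS1-like)."""
    threshold: float       # input level at which the rule becomes active
    dependent_dose: float  # slope, i.e. the factor applied to the input
    fixed_dose: float      # constant offset added when the rule is active

    def output(self, x: float) -> float:
        """Hormone amount (or actuator value) produced for input x."""
        if x < self.threshold:
            return 0.0     # rule not triggered
        return self.dependent_dose * x + self.fixed_dose


if __name__ == "__main__":
    rule = LinearRule(threshold=0.5, dependent_dose=2.0, fixed_dose=0.1)
    for x in (0.2, 0.5, 1.0):
        print(x, rule.output(x))
```

The discrete on/off switch at the threshold is the property that the trigger window of AHHS2 replaces by a gradual weighting.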



2.2 AHHS2 

Based on AHHS1 we designed an improved variant called AHHS2. The guiding
principle of this improved controller design was to gain higher evolvability by 
creating smoother fitness landscapes. There were three main changes. 
First, we introduced an additional rule type that implements nonlinear
hormone-to-hormone interactions in the general form of $\Delta x / \Delta t = xy$, where $x$ is the
considered hormone concentration and $y$ is the hormone concentration of the
influencing hormone that triggers the considered rule. The idea is to increase the
intrinsic dynamics (basically transient behavior before equilibria are reached) of
the hormone network even without significant sensor input.
Second, a rule is not just triggered by exceeding or falling below a threshold
but is linearly weighted within a trigger window (i.e., a tent function with a
maximum of 1, defined by a center and a width, see eq. 2 below).
Third, the mutation of rule types in the form of discrete switches seemed to
be too radical. This was overcome by introducing a concept of weights for rule
types. Now, each rule can operate as any rule type at the same time. Each rule
has a weight for each of the five rule types summing up to one (see Fig. 2). The
influence of a rule type is proportional to its weight, for example, the sensor-rule
aspect of a rule with a weight of 0.1 will produce only 10% of the hormone it
would produce if its weight were 1, see $w_c$ in eq. 1 below. A mutation will
now only change two rule weights by reducing one by $w$ and adding $w$ to the
other weight. In a well adapted controller we would expect that the weights of
a rule are mainly concentrated on one or at most two rule types. Other weight 
distributions should be transitional only because specialization allows for better 
optimization. 
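
The weight-shifting mutation can be sketched in a few lines of Python. This is a sketch under assumptions: the way the two weights and the shifted amount are drawn is not specified here, so they are drawn uniformly at random, and the names `RULE_TYPES`, `mutate_weights`, and `max_shift` are hypothetical.

```python
import random

# The five rule types of AHHS2; the order is arbitrary.
RULE_TYPES = ("sensor", "linear_hormone", "nonlinear_hormone", "actuator", "idle")


def mutate_weights(weights, max_shift=0.1, rng=random):
    """Move an amount w from one rule-type weight to another.

    The sum of the weights stays constant and no weight becomes negative,
    because w is capped by the weight it is taken from.
    """
    src, dst = rng.sample(range(len(weights)), 2)   # two distinct rule types
    w = min(rng.uniform(0.0, max_shift), weights[src])
    mutated = list(weights)
    mutated[src] -= w
    mutated[dst] += w
    return mutated


if __name__ == "__main__":
    weights = [0.7, 0.1, 0.1, 0.1, 0.0]
    print(dict(zip(RULE_TYPES, mutate_weights(weights))))
```

Because only a bounded amount is shifted between two weights, a single mutation cannot switch a rule from one type to another in one step, which is the intended contrast to the discrete switches of AHHS1.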

The mathematical closed-form of this concept using the example of a linear
hormone rule type is

$$C(t) = w_c \, \theta(H_k(t)) \, (\lambda H_k(t) + \kappa), \qquad (1)$$

where $C(t)$ is the hormone amount that is to be added to the considered hormone
at time $t$, $w_c$ is the weight of the linear hormone rule (see Fig. 2), $k$ is the index
of the input hormone and $H_k$ is its concentration, $\lambda$ is the dependent dose, and
$\kappa$ the fixed dose. $\theta$ is called the trigger function and is defined by

$$\theta(x) = \begin{cases} \frac{1}{\eta}\left(\eta - |x - \zeta|\right) & \text{if } |x - \zeta| < \eta \\ 0 & \text{else} \end{cases} \qquad (2)$$

for trigger window center $\zeta$ and trigger window width $\eta$. For a more detailed
introduction of AHHS2 and for a comparison of the AHHS approach to the
standard ANN approach, see Hamann et al. (2010).
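
A direct transcription of the trigger function and of the linear hormone rule into Python may help to see how the three factors interact. It is a minimal sketch: the function and parameter names are hypothetical, and the tent shape is taken from the description of the trigger window as a function with a maximum of 1 defined by a center and a width.

```python
def trigger(x, center, width):
    """Tent-shaped trigger function: 1 at the window center, 0 outside."""
    dist = abs(x - center)
    if dist < width:
        return (width - dist) / width
    return 0.0


def linear_hormone_dose(h_k, weight, center, width, dependent_dose, fixed_dose):
    """Amount added to the output hormone by the linear-hormone-rule aspect.

    h_k            -- concentration of the input hormone
    weight         -- weight of the linear hormone rule type for this rule
    dependent_dose -- factor applied to h_k
    fixed_dose     -- constant offset
    """
    return weight * trigger(h_k, center, width) * (dependent_dose * h_k + fixed_dose)


if __name__ == "__main__":
    for h in (0.3, 0.4, 0.5, 0.6, 0.7):
        print(h, linear_hormone_dose(h, weight=1.0, center=0.5, width=0.2,
                                     dependent_dose=1.0, fixed_dose=0.5))
```

The dose fades out continuously towards the borders of the trigger window, so a small shift of the window center or width changes the rule's effect only slightly.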



Note that the rule parameters (fixed dose, input hormone, trigger window etc.) 
are correlated via the rule types. For example, the input hormone is used for 
both the linear and the nonlinear hormone rule. If we allowed independent
parameters for each rule type, the genome (encoding of the controller) size would
increase by a factor of about three. This is a tradeoff in the complexity of
the genome and, for example, a difficulty when analyzing the results. This is
related to the completeness-vs-compactness challenge (Mataric and Cliff, 1996).



3 Investigated scenarios 

Our main focus is on the field of modular robotics and our main concern is 
whether we are able to evolve fast locomotion in the gait learning task. Still,
we also tested the AHHS approach in an inverted pendulum task due
to its lower computational complexity.
Figure 2: Rule type weights of the AHHS2 approach compared to AHHS1
(abbreviations: sensor rule, linear hormone rule, nonlinear hormone rule, actuator
rule).

3.1 Inverted pendulum 

In addition to the gait learning task, we tested the AHHS approach in a task
that is easier to handle: balancing the inverted pendulum (see Fig. 3). The
computational demand of the gait learning task is very high due to the
sophisticated simulation of physics. We satisfy the need for a simulation of
lower computational complexity by introducing the inverted pendulum task. This
way, higher statistical significance of the results can be reached within
reasonable computation time.
The original inverted pendulum is only loosely related to a real robotic task.
Therefore, we adapted it to our requirements. The sensors are noisy (uniformly
distributed and uncorrelated in time, ±2.3%) and the sampling rates of the
sensors are low, which is documented by the relation between the cycle length
$\tau$ and the maximal angular velocity of $0.05\pi\,[1/\tau] = 9^\circ\,[1/\tau]$.
That is, the pendulum can move up to 9° between two calls of the controller, so
the controller has little time to adapt to new configurations. Furthermore, the
sensors do not deliver actual angles and positions directly; instead, the values
are partitioned onto several sensors and are relative rather than absolute
(distance to wall instead of the crab's position etc.). The AHHS controls two
outputs, left actuator A0 and right actuator A1, while the speed control of the
crab is determined by their difference. The pendulum is started in the lower
equilibrium position, so the nonlinear up-swinging phase is included. Combined
with the sensor noise it is impossible for the controller to balance the
pendulum in the upper equilibrium position. So the task stays dynamic and the
controller is exposed to new situations constantly. The fitness function is the
summation over all time steps of the angular distance to the top position in
radians.
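
The accumulated angular distance and the step-size arithmetic can be checked with a short Python sketch. It assumes that the angle is measured in radians with 0 at the top position and that angles are wrapped to the shorter way around; whether the sum is minimized directly or transformed further is not stated here. All names are hypothetical.

```python
import math

# Maximal change of the pendulum angle between two controller calls.
MAX_STEP_RAD = 0.05 * math.pi


def angular_distance_to_top(phi):
    """Shortest angular distance (radians) to the top position phi = 0."""
    wrapped = (phi + math.pi) % (2.0 * math.pi) - math.pi
    return abs(wrapped)


def accumulated_distance(angles):
    """Sum of the angular distances to the top over all time steps."""
    return sum(angular_distance_to_top(phi) for phi in angles)


if __name__ == "__main__":
    print(math.degrees(MAX_STEP_RAD))          # step size in degrees
    # Pendulum hanging down for three steps, then swinging towards the top.
    trajectory = [math.pi, math.pi, math.pi, 2.0, 1.0, 0.2]
    print(accumulated_distance(trajectory))
```

A pendulum that stays in the lower equilibrium contributes the maximal distance of π in every time step, so any up-swinging reduces the accumulated value.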






Figure 3: Inverted pendulum, pendulum free to move full 360° mounted on the 
crab that moves in one dimension (left/right) bounded by walls. 



3.2 Gait learning in multi-modular robotics 



Gait learning in legged robotics is a commonly studied task in evolutionary
robotics as reported by Nelson et al. (2009). However, here we investigate gait
learning in multi-modular robotics. Each module consists of one hinge and we 
connect five modules. These five hinges are controlled decentrally although the 
modules have a low-level communication channel by means of diffusing hor- 
mones. 

In contrast to the standard tasks of gait learning and collision avoidance, the 
challenge of gait learning in multi-modular robotics is more complex. The
resulting gait is emergent due to the decentralized and cooperative control of
the actuators. In addition, there are several conceptually different solutions,
that is, different techniques of locomotion with good performance (e.g.,
caterpillar-like, erected walk, small jumps).

In each module the same controller is executed. Therefore, the gait learning 
task includes several sub-tasks. The organism has to break the symmetry (head 
and tail), synchronize through collective cooperation, and start moving into a 
common direction. This synchronization aspect is similar to the gait learning
task for a legged robot with HyperNEAT by Clune et al. (2009).
All of this work is based on simulations as the actual hardware is not yet available
(see Fig. 4 for a current prototype of Symbrion and Replicator (SYMBRION,
2010; REPLICATOR, 2010)). We use the simulation environment Symbricator3D
by Winkler and Wörn (2009) that was developed for these projects. We
use the current prototype design in the simulation (imported CAD data) as
described in (Levi and Kernbach, 2010). However, we simplified the sensor setting
to four proximity sensors (equally distributed around the robot shifted by 90
degrees: upwards, forwards, downwards, backwards). Symbricator3D is based on
the game engine Delta-3D and currently uses the Open Dynamics Engine for 
the simulation of dynamics. The simulation of friction and momentum is impor- 
tant because the evolved gait behaviors rely on them. A drawback is that high 
computational complexity limits the number of evaluations in our evolutionary 
runs. We are interested in systems that evolve useful behaviors within a few 
hundred generations and with small populations (order of 10). 
Figure 4: Two connected prototypes of the projects SYMBRION (2010) and
REPLICATOR (2010).

We have tested the AHHS controllers with two variants of the simulation
framework. In the first version, the forces in the joints that connect the modules
were damped and small displacements of the modules at the joints were allowed 
(i.e., simulation reacts moderately to big forces). It turned out that caterpillar- 
like locomotion was favored because the damped joints support wave motion. In 
the second version, the joints were fully fixed. In this version of the simulation 
the evolution of locomotion is more difficult which will be reflected by the best 
fitnesses in the following. 

We start the scenario with five robot modules which are simply connected in a 
chain. Initially this robotic organism is placed in the center of the arena. In 
order to increase the complexity of the gait learning task, the central area is 
surrounded by a low wall forming a square (its height is about half the height 
of a robot module). Outside the wall several cubes are placed that can only
be sidestepped by the organism. An identical robot controller is uploaded to 
the memory of all five modules. The robot modules have to figure out their 
position (their role within the configuration), that is, they have to break the 
symmetry of the configuration in order to generate a coordinated gait. This is, 
for example, possible because of different outputs of proximity sensors depend- 
ing on the modules' positions. There are three classes of modules defined by 
their characteristic sensor inputs: front module, back module, and modules in 
between. We use identical controllers because we want to apply them to dy- 
namic body shapes in our future work and also a single module should have all 
functionality. Hence, uploading heterogeneous controllers with predefined roles 
would not be an option. In addition, using self-organized role assignment will 
allow for high scalability (using the same controller for different body sizes), 
plasticity (reorganization of roles in changing body shapes), and new role types
might emerge that were not anticipated by the human designer.
The fitness is defined by the covered distance of the organism. It is an aggregate
fitness function (Nelson et al., 2009) that evaluates the organism's performance
as a whole. Although the organisms might achieve advancements early in the
evolutionary run, there is a bootstrapping problem. For example, the downward
proximity sensors will not give significant input until the organism has figured
out how to erect the modules in the middle. In addition, controllers cannot
evolve special techniques to climb the wall before they have actually managed
to move the organism there to explore it.

Figure 5: Inverted pendulum, AHHS1 with 60 rules, AHHS2 with 15 rules,
comparison of fitness and evolution speed (generation when 75% of max. fitness
was reached).



4 Results and discussion 
4.1 Inverted pendulum 

The evolutionary runs of the inverted pendulum were performed with a popula- 
tion of 200 randomly initialized controllers. The AHHS was set to 15 hormones. 
For AHHS1 60 rules were used and 15 for AHHS2. The runs were stopped after 
200 generations. Linear proportional selection was used and elitism was set to 
one. The mutation rate was 0.15 per gene with a maximal, absolute change of 
range 0.1. The recombination (two-point crossover) rate was 0.05. 
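
The evolutionary setup can be summarized in a compact Python sketch. It is a sketch under assumptions and not the original implementation: genomes are treated as flat lists of floats, fitness values are assumed to be non-negative and not all zero, the per-gene change is drawn uniformly, and all function names are hypothetical.

```python
import random


def mutate(genome, rate=0.15, max_change=0.1, rng=random):
    """Change each gene with probability `rate` by at most `max_change`."""
    return [g + rng.uniform(-max_change, max_change) if rng.random() < rate else g
            for g in genome]


def two_point_crossover(a, b, rng=random):
    """Replace the segment between two cut points of `a` by that of `b`."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:]


def select_proportional(population, fitnesses, rng=random):
    """Fitness-proportional selection of one individual."""
    return rng.choices(population, weights=fitnesses, k=1)[0]


def next_generation(population, fitnesses, crossover_rate=0.05, rng=random):
    """Build the next population with an elitism of one."""
    best = max(range(len(population)), key=lambda i: fitnesses[i])
    offspring = [list(population[best])]            # elite is copied unchanged
    while len(offspring) < len(population):
        child = list(select_proportional(population, fitnesses, rng))
        if rng.random() < crossover_rate:
            mate = select_proportional(population, fitnesses, rng)
            child = two_point_crossover(child, mate, rng)
        offspring.append(mutate(child, rng=rng))
    return offspring


if __name__ == "__main__":
    rng = random.Random(0)
    population = [[rng.random() for _ in range(10)] for _ in range(200)]
    for generation in range(5):
        fitnesses = [sum(g) for g in population]    # toy fitness for illustration
        print(generation, max(fitnesses))
        population = next_generation(population, fitnesses, rng=rng)
```

In the actual experiments the toy fitness would be replaced by an evaluation of the controller in the pendulum or gait simulation, which dominates the computation time.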
For this task we configured AHHS with a left and a right compartment. The
left compartment incorporates the left actuator A0, the left proximity sensor,
the sensors giving the angles of the pendulum when it is in the left half etc.;
the right compartment is configured correspondingly.

The comparison of the best controllers of each run is shown in Fig. 5(a). In
this scenario, AHHS2 performs significantly better than AHHS1 although in
terms of evolution speed there is no significant difference (see Fig. 5(b)). The
AHHS2 design is the better choice in this task. The cause of the advantage of 
AHHS2 over AHHS1 in this task compared to the indistinct situation in the gait 
learning task is unclear. In future studies we will investigate whether this trend 
will also be observed in more complex tasks from the domain of multi-modular, 
evolutionary robotics. 

One of the best evolved AHHS2 controllers showing interesting behavior is
analyzed in the following. While it is not possible to keep the pendulum in the
upper equilibrium for a longer time due to noise, the controller still tries to
maximize the time the pendulum is close to the upper equilibrium, mostly by small
displacements of the crab. The controller is mainly based on one hormone (H0)
and four rules (see Fig. 6). Sensor S0 reaches its maximum if the pendulum
approaches $\phi = 0$ (top position) from the left. It triggers small displacements
of the crab to the right, a behavior that keeps the pendulum turning
counterclockwise with slow passes at the top position. Sensor Sg gives the intensity
of negative angular velocities of the pendulum (clockwise turns) and triggers
moves of the crab to the left. The proximity sensors are not used at all. The
walls are avoided by the crab movements depending on position and turning
direction of the pendulum. Hence, the position of the crab is virtually encoded
in the motion of the pendulum.

Figure 6: Inverted pendulum, analysis of one of the best evolved AHHS2
controllers; only most relevant rules of the evolved behavior are shown.

See Fig. 7 for the sensor, hormone, and actuator dynamics. This sample run
begins with an initial (t < 50) move of the crab from the center to the outer
left due to transient dynamics of H0 in the left compartment (see Fig. 7(a)).
This motion implements the up-swinging of the pendulum and is followed by
ten small displacements of the crab to the right to keep the pendulum swinging
counterclockwise. At t = 1093 the turning direction of the pendulum changes
(see Fig. 7(b)). A sequence of right-left movements is initiated to reestablish the
counterclockwise turning. Later at t = 1933 a phase of low angular velocity is
reached which causes irregular movements of the crab that hold the pendulum
close to the top position.



http://heikohamann.de/pub/hamannEtAlAlife2010pend.mpg
(a) most relevant hormone H0 (upper and lower half, red), actuator left A0 (upper half,
black), right A1 (lower half, black)

(b) pendulum angle sensor S0 for $0 < \phi < \pi/2$ (purple), negative angular velocity sensor
Sg (lower half, yellow)

Figure 7: Inverted pendulum, most relevant hormone, sensors, and both
actuator control values for both compartments (left and right) of the evolved
behavior.



4.2 Gait learning 

The evolutionary runs of the gait learning task were performed with a population 
of 20 randomly initialized controllers. The configuration of the AHHS was set 
to 5 hormones. The number of rules was varied between 20 and 300. The runs 
were stopped after 200 generations. Linear proportional selection was used and 
elitism was set to one. The mutation rate was 0.15 per gene (rule or hormone, 
with a maximal, absolute change of range 0.1). The recombination (two-point 
crossover) rate was 0.05. One run of the evolution (full 200 generations) took 
about 28 hours of CPU time (on a single core of a standard, up-to-date desktop 
PC). 

In the first version of the simulation (damped joints), the evolved behaviors
reach high fitness values for all investigated settings of the AHHS (see Fig. 8).
Directly approaching the wall yields a fitness of about 0.7, getting one half of the
modules over the wall yields a fitness of 0.8, and a fitness of above 1 is reached
if the wall is overcome. Typically the evolved behaviors rely on only two or three of
the five provided hormones and make use of fewer than ten rules. However,
too low a number of rules results in too little exploration of the behavior space.
Based on preliminary tests we decided to use 30 rules for AHHS2. One AHHS2
rule is potentially active for each rule type, which corresponds to four active
AHHS1 rules. However, AHHS2 cannot optimize the parameters for each rule
type individually. Still, we tested the AHHS1 with 120 rules and also with a
much higher number of 300 rules. The results show no statistically significant
differences but indicate a trend that AHHS1 does not reach results comparable
to those of AHHS2 with corresponding rule numbers. In addition, the behaviors
evolved by AHHS1 show high variance caused by the deterministic chaos
of the complex system (simulation of physics).

Figure 8: 5-module gait learning with damped joints, comparison of fitness and
evolution speed, which is indicated by the generation in which 75% of the overall
max. fitness (1.41 = 0.75 × 1.88) was reached (if at all).
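
The evolution-speed measure used in the figures (first generation in which 75% of the overall maximal fitness is reached, if at all) can be written down in a few lines of Python. The function name and the example fitness values are hypothetical; only the threshold 0.75 × 1.88 = 1.41 is taken from the caption of Fig. 8.

```python
def generation_reaching(best_per_generation, overall_max, fraction=0.75):
    """First generation whose best fitness reaches fraction * overall_max.

    Returns None if the threshold is never reached in this run.
    """
    threshold = fraction * overall_max
    for generation, fitness in enumerate(best_per_generation):
        if fitness >= threshold:
            return generation
    return None


if __name__ == "__main__":
    # Made-up best-fitness curve of one run, for illustration only.
    run = [0.2, 0.7, 0.8, 1.1, 1.5, 1.6]
    print(generation_reaching(run, overall_max=1.88))
```

Runs that never reach the threshold yield no generation number and therefore have to be treated separately in any statistical comparison.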

Using the second version of the simulation (fixed joints), we have tested smaller
differences in the number of rules between AHHS1 and AHHS2. The results
show that the more realistic simulation of the joints complicates the evolution
of fast locomotion. However, the favoring of caterpillar-like locomotion is
reduced significantly and, especially in the case of AHHS2, an unexpectedly vast
diversity of different locomotion paradigms is observed (see Fig. 9 for a short
collection). Basically we observed three classes of locomotion: erected walking
behavior, caterpillar-like locomotion, and locomotion through jumps. The
behaviors evolved using AHHS1 were less diverse. Quantifying these differences
will be the focus of future studies.



The comparison of the best evolved behaviors is shown in Fig. 10(a) and the
speed of evolution is shown in Fig. 10(b). 55% of the AHHS2 runs with 50 rules
and 38% of the AHHS1 runs with 80 rules reach a best fitness that is within
80% of the theoretical maximum fitness of about 1.7. Significant differences are
only found for AHHS1 with 20 rules compared to both AHHS1 with 80 rules
and to AHHS2 with 50 rules. Noticeable is the poor performance of AHHS2 with
just 20 rules both in terms of final best fitness and speed of evolution. From our
observations we speculate that the initial exploration (during a few of the early
generations) of the search space (basically the sensory-motor configurations) is
a relevant feature. Identifying the actual shortcoming of AHHS2 in this context
is part of our future research.

One important aspect in the differences between the two controller types seems 
to be the different triggering of rules in AHHS1 and AHHS2. The behaviors 
of AHHS1 clearly show more fast-paced movements. With damped joints this 
seems to be a disadvantage as smooth movements are less likely. Using the fixed 
joints this sometimes results in fast locomotion through little jumps. 



http://heikohamann.de/pub/hamannEtAlAlife2010.mpg
The evolved structures are complex and the underlying processes are often
counter-intuitive. The in-depth analysis of individual behaviors is facilitated by
considering the number of steps a rule has been active (triggered). Typically,
about one third of the rules trigger never or very seldom.

4.3 Post-evaluation and analysis 

We have investigated the behavior of one of the best evolved AHHS2 controllers
in the second version of the simulator. It shows a dynamic caterpillar-like
motion. It is noticeable that the rules show characteristics of specialization and
optimization. For example, often the (floating) index of the output hormone is
close to an integer (i.e., the rule's effect is mostly limited to one hormone) and
often rule weights are above 0.5 showing the specialization of those rules. For
the investigated controller we have identified the three most relevant hormones:
H2, H3, and H4. The angle of the hinge is mainly controlled by hormones H3 and
H4 (see Fig. 11(a)). High values of H4 turn the hinge towards +90° while any
value of H3 > 0 turns the hinge towards −90°. As a reinforcing effect there is a
hormone rule that decreases H4 if H3 > 0. H2 shows the influence of the diffusion
of hormones through the organism (see Fig. 11(b)). A decreasing concentration
in the back module is consequently followed by a decrease in the second-to-last,
middle, and second module, hence forming a hormone wave that
propagates through the organism.
Finally, we investigated the influence of mutations. The leading design paradigm
of AHHS2 was to improve the causality of the mutation operator (small changes
in the genome result in small changes in the behavior). As an example, we took
one evolved controller of each type. For both we produced 35 controllers, each
by applying the mutation operator once. The evaluated fitnesses of these 35
controllers are shown as a histogram in Fig. 12. For AHHS1 the majority of
mutated controllers had a fitness of less than 0.2. For AHHS2 the majority of
mutated controllers reached about the original fitness. For both types some
controllers reached higher fitness due to variance introduced by deterministic
chaos in the simulated physics.
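
The sampling of the fitness-landscape neighborhood can be sketched as follows in Python. The sketch only fixes the procedure (35 single-step mutants of one evolved controller, each evaluated once); the mutation operator and the fitness evaluation are passed in as functions, and the toy versions in the usage example, the tolerance of 0.1, and all names are hypothetical.

```python
import random


def sample_neighborhood(genome, mutate, evaluate, samples=35):
    """Fitness values of `samples` mutants, each one mutation step from `genome`."""
    return [evaluate(mutate(genome)) for _ in range(samples)]


def fraction_near_original(fitnesses, original_fitness, tolerance=0.1):
    """Share of mutants whose fitness stays within `tolerance` of the original."""
    close = sum(1 for f in fitnesses if abs(f - original_fitness) <= tolerance)
    return close / len(fitnesses)


if __name__ == "__main__":
    rng = random.Random(0)

    def toy_mutate(genome):
        # Stand-in for the controller mutation operator.
        return [g + rng.uniform(-0.1, 0.1) for g in genome]

    def toy_evaluate(genome):
        # Stand-in for a full simulation run.
        return sum(genome) / len(genome)

    original = [0.8] * 10
    fitnesses = sample_neighborhood(original, toy_mutate, toy_evaluate)
    print(fraction_near_original(fitnesses, toy_evaluate(original)))
```

A high share of mutants close to the original fitness corresponds to a smooth neighborhood, that is, to high causality of the mutation operator.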



5 Conclusion and Outlook 

We have reported the application of our hormone control approach to the do- 
main of evolutionary modular robotics. The automatic synthesis of controllers, 
that facilitate locomotion of organisms built from five robot modules, has been 
effective in a majority of the evolutionary runs. Almost all evolved controllers 
are able to generate a form of locomotion that takes the organism at least to the 
wall. A majority of the evolved controllers were able to overcome the wall. An 
unexpected vast diversity of locomotion paradigms was evolved especially in the 
second version of the simulation. On the one hand, this shows the complexity 
of the gait learning task in modular robotics because there are many solutions 
of similar utility. On the other hand, it shows the diversity of behaviors
representable by AHHS controllers.

Whether the redesigned controller AHHS2 is generally superior to the original 
AHHS1 design is still an open question. However, in case of the inverted pendu- 
lum it performs significantly better. In the gait learning scenario AHHS2 shows 
a higher diversity and behaviors with smoother movements resulting in more 
reliable locomotion. 

There are many open issues and this research track is rather at its beginning. 
Our future research will include the following. The different possibilities of ini- 
tializations need to be investigated extensively. For example, the controllers 
could be initialized with specialized sensor, hormone, and actuator rules (i.e., 
weights of 1). Scalability and more complex tasks from the domain of modu- 
lar robotics will be investigated (e.g., organisms with more modules). We plan 
to use environmental incremental evolution (e.g., steadily increasing heights of
walls) as reported by Nakamura et al. (2000). The dynamic adaptation of rule
numbers by evolution will be investigated. Hence, we will evolve hormone
reaction networks through complexification similar to (Stanley and Miikkulainen,
2004). Finally, we plan to check the controllers' exploration of the sensory-motor
space, especially during the initial generations, to get a better understanding of
what facilitates a high diversity of solutions.
(c) independent hinges (d) caterpillar-like
(e) jumping (f) warping over the wall



Figure 9: Screenshots showing the diversity of evolved locomotion paradigms 
(colors represent three selected hormones in the primary colors according to the 
RGB color model). 
(a) fitness (Wilcoxon p < 0.05)
(b) generation (Wilcoxon p < 0.05)



Figure 10: 5-module gait learning with fixed joints, comparison of fitness and 
evolution speed, which is indicated by the generation in which 75% of the overall 
max. fitness was reached (if at all). 
(a) Most relevant hormones H3 (black) and H4 (purple), and hinge control angle $\phi$ (yellow).
(b) Hormone H2 in all five modules, demonstrating the effect of diffusion (from front module
to back: light to dark).



Figure 11: 5-module gait learning with fixed joints, analysis of the evolved 
behavior. 
Figure 12: Fitness landscape neighborhood, fitness histogram of 35 samples of
mutated controllers, fitness of the original controller is for AHHS1: 0.84, for 
AHHS2: 0.81. 